Combining Learning and Word Sense Disambiguation for Intelligent User Profiling
Abstract
Understanding user interests from text documents can support personalized information recommendation services. Typically, these services automatically infer the user profile, a structured model of the user interests, from documents that the user has already deemed relevant. Traditional keyword-based approaches are unable to capture the semantics of the user interests. This work proposes the integration of linguistic knowledge into the process of learning semantic user profiles that capture concepts concerning user interests. The proposed strategy consists of two steps. The first is based on a word sense disambiguation technique that exploits the lexical database WordNet to select, among all the possible meanings (senses) of a polysemous word, the correct one. In the second step, a naïve Bayes approach learns semantic sense-based user profiles as binary text classifiers (user-likes and user-dislikes) from the disambiguated documents. Experiments were conducted to compare the performance of keyword-based profiles with that of sense-based profiles. Both the classification accuracy and the effectiveness of the ranking that each kind of profile imposes on the documents to be recommended were considered. The main outcome is that sense-based profiles increase classification accuracy without improving the ranking. The conclusion is that integrating linguistic knowledge into the learning process improves the classification of those documents whose classification score is close to the likes/dislikes threshold (the items for which the classification is highly uncertain).
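The two-step strategy described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy `SENSES` inventory stands in for WordNet, the gloss-overlap rule in `disambiguate` is a simplified Lesk-style heuristic rather than the paper's disambiguation technique, and `NaiveBayesProfile` is a plain multinomial naïve Bayes classifier with add-one smoothing, not the authors' implementation.

```python
import math
from collections import Counter

# Hypothetical toy sense inventory standing in for WordNet:
# word -> {sense id: words that typically appear in that sense's gloss}.
SENSES = {
    "bank": {
        "bank#finance": {"money", "loan", "deposit", "account"},
        "bank#river": {"river", "water", "shore", "fishing"},
    },
}


def disambiguate(tokens):
    """Step 1 (simplified): map each polysemous token to the sense whose
    gloss shares the most words with the document; other tokens are kept."""
    context = set(tokens)
    result = []
    for token in tokens:
        senses = SENSES.get(token)
        if senses is None:
            result.append(token)
        else:
            # sorted() makes tie-breaking deterministic.
            best = max(sorted(senses), key=lambda s: len(senses[s] & context))
            result.append(best)
    return result


class NaiveBayesProfile:
    """Step 2 (simplified): a binary likes/dislikes profile learned from
    disambiguated documents with multinomial naive Bayes."""

    LABELS = ("likes", "dislikes")

    def __init__(self):
        self.counts = {label: Counter() for label in self.LABELS}
        self.docs = Counter()

    def train(self, tokens, label):
        self.counts[label].update(disambiguate(tokens))
        self.docs[label] += 1

    def log_score(self, tokens, label):
        vocab = set(self.counts["likes"]) | set(self.counts["dislikes"])
        total = sum(self.counts[label].values())
        # Log prior plus add-one smoothed log likelihood of each feature.
        score = math.log(self.docs[label] / sum(self.docs.values()))
        for feature in disambiguate(tokens):
            score += math.log((self.counts[label][feature] + 1) / (total + len(vocab)))
        return score

    def classify(self, tokens):
        return max(self.LABELS, key=lambda label: self.log_score(tokens, label))


profile = NaiveBayesProfile()
profile.train(["bank", "loan", "money"], "likes")
profile.train(["bank", "river", "fishing"], "dislikes")
print(profile.classify(["bank", "deposit"]))
```

In this toy setting the keyword "bank" occurs once in each class, so a keyword-based profile gets no evidence from it, whereas the sense feature `bank#finance` appears only in the liked document. This is the kind of near-threshold case the abstract says benefits from disambiguation.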
Related resources
Mining Semantically Indexed Documents for Intelligent User Profiling
Typically, personalized information recommendation services automatically infer a user profile, a structured model of the user interests, from documents the user already deemed as relevant. Traditional keyword-based approaches are unable to capture the semantics of the user interests. This work proposes a strategy consisting of two steps. The first one is a semantic indexing procedure based on ...
Word Sense Disambiguation for Vocabulary Learning
Words with multiple meanings are a phenomenon inherent to any natural language. In this work, we study the effects of such lexical ambiguities on second language vocabulary learning. We demonstrate that machine learning algorithms for word sense disambiguation can induce classifiers that exhibit high accuracy at the task of disambiguating homonyms (words with multiple distinct meanings). Result...
Design and implementation of Persian spelling detection and correction system based on Semantic
The Persian language has special features (grapheme, homophone, and multi-shape clinging characters) in electronic devices. Furthermore, the design and implementation of NLP tools for Persian are more challenging than for other languages (e.g. English or German). Spelling tools are widely used for editing user texts such as emails and text in editors. Also developing Persian tools will provide Persian progr...
Towards High-performance Word Sense Disambiguation by Combining Rich Linguistic Knowledge and Machine Learning Approaches
An Intelligent Personalized Service for Conference Participants
This paper presents the integration of linguistic knowledge in learning semantic user profiles able to represent user interests in a more effective way with respect to classical keyword-based profiles. Semantic profiles are obtained by integrating a naïve Bayes approach for text categorization with a word sense disambiguation (WSD) strategy based on the WordNet lexical database (Section 2). Sem...